Overview

  • TensorFlow Basics

  • R Interfaces to TensorFlow

  • Deep Learning

  • Supporting Tools

  • Deployment

  • Learning More

What is TensorFlow?

A general-purpose numerical computing library

Why should R users care?

  • A new general-purpose numerical computing library!
    • Hardware independent
    • Distributed execution
    • Large datasets
    • Automatic differentiation
  • Robust foundation for many deep learning applications

  • TensorFlow models can be deployed with a low-latency C++ runtime

  • R has a lot to offer as an interface language for TensorFlow

Example: Greta

Writing statistical models and fitting them by MCMC

Greta Air Model

# Greta
theta <- normal(0, 32, dim = 2)
mu <- alpha + beta * Z
X <- normal(mu, sigma)
p <- ilogit(theta[1] + theta[2] * X)
distribution(y) <- binomial(n, p)
# BUGS/JAGS
for(j in 1 : J) {
   y[j] ~ dbin(p[j], n[j])
   logit(p[j]) <- theta[1] + theta[2] * X[j]
   X[j] ~ dnorm(mu[j], tau)
   mu[j] <- alpha + beta * Z[j]
}
theta[1] ~ dnorm(0.0, 0.001)
theta[2] ~ dnorm(0.0, 0.001)

What are tensors?

Data stored in multidimensional arrays

Dimension   R object
0D          42
1D          c(42, 42, 42)
2D          matrix(42, nrow = 2, ncol = 2)
3D          array(42, dim = c(2, 3, 2))
4D          array(42, dim = c(2, 3, 2, 3))
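The table above can be checked directly in R with dim() (note that a bare scalar like 42 is a length-one vector with NULL dim):

```r
x0 <- 42                                   # 0D: scalar (length-one vector in R)
x1 <- c(42, 42, 42)                        # 1D: vector
x2 <- matrix(42, nrow = 2, ncol = 2)       # 2D: matrix
x3 <- array(42, dim = c(2, 3, 2))          # 3D: array
x4 <- array(42, dim = c(2, 3, 2, 3))       # 4D: array

dim(x0)  # NULL
dim(x2)  # 2 2
dim(x4)  # 2 3 2 3
```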

Some examples

  • Vector data—2D tensors of shape (samples, features)

  • Timeseries or sequence data—3D tensors of shape (samples, timesteps, features)

  • Images—4D tensors of shape (samples, height, width, channels)

  • Video—5D tensors of shape (samples, frames, height, width, channels)
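For example, a batch of 100 sequences, each with 20 timesteps of 5 features, is stored as a 3D array; a sketch in base R (the sizes here are arbitrary, chosen only for illustration):

```r
# 100 samples x 20 timesteps x 5 features, filled with zeros
timeseries <- array(0, dim = c(100, 20, 5))
dim(timeseries)     # 100 20 5

# The feature vector for sample 1 at timestep 3:
timeseries[1, 3, ]  # a length-5 vector
```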

2D Tensors

Vector data

head(data.matrix(iris), n = 10)
      Sepal.Length Sepal.Width Petal.Length Petal.Width Species
 [1,]          5.1         3.5          1.4         0.2       1
 [2,]          4.9         3.0          1.4         0.2       1
 [3,]          4.7         3.2          1.3         0.2       1
 [4,]          4.6         3.1          1.5         0.2       1
 [5,]          5.0         3.6          1.4         0.2       1
 [6,]          5.4         3.9          1.7         0.4       1
 [7,]          4.6         3.4          1.4         0.3       1
 [8,]          5.0         3.4          1.5         0.2       1
 [9,]          4.4         2.9          1.4         0.2       1
[10,]          4.9         3.1          1.5         0.1       1

3D Tensors

Timeseries or sequence data

4D Tensors

Image data

What is tensor "flow"?

A dataflow graph with nodes representing units of computation

  • Parallelism—The system identifies operations that can execute in parallel.
  • Distributed execution—The graph can be partitioned across multiple devices.
  • Compilation—Information in the dataflow graph is used to generate faster code (e.g. by fusing operations).
  • Portability—The dataflow graph is a language-independent representation of your model (deploy with the C++ runtime).

R Interface to TensorFlow

  • High-level R interfaces for neural nets and traditional models

  • Low-level interface to enable new applications (e.g. Greta)

  • Tools to facilitate a productive workflow and experiment management

  • Easy access to GPUs for training models

  • Breadth and depth of educational resources

TensorFlow APIs

Distinct interfaces for various tasks and levels of abstraction

R Packages

TensorFlow APIs

  • keras—Interface for neural networks, with a focus on enabling fast experimentation.
  • tfestimators—Implementations of common model types such as regressors and classifiers.
  • tensorflow—Low-level interface to the TensorFlow computational graph.
  • tfdatasets—Scalable input pipelines for TensorFlow models.

Supporting Tools

  • tfruns—Track, visualize, and manage TensorFlow training runs and experiments.
  • tfdeploy—Tools designed to make exporting and serving TensorFlow models straightforward.
  • cloudml—R interface to Google Cloud Machine Learning Engine.

Keras

  • High-level neural networks API capable of running on top of TensorFlow, CNTK, or Theano (and soon MXNet).

  • Allows for easy and fast prototyping (through user friendliness, modularity, and extensibility).

  • Supports both convolutional networks and recurrent networks, as well as combinations of the two.

  • Runs seamlessly on CPU and GPU.

  • https://keras.rstudio.com

Keras Adoption

Layers in Neural Networks

A data-processing module that you can think of as a filter for data

Layers in Neural Networks (cont.)

Layers implement a form of progressive data distillation
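Conceptually, a layer is just a function from input tensors to output tensors, and stacking layers composes those functions. A toy base-R sketch of two chained dense layers (not the Keras API, purely an illustration; all names and sizes here are made up):

```r
relu <- function(x) pmax(x, 0)

# A dense layer: output = activation(input %*% weights + bias)
dense_layer <- function(input, W, b) relu(sweep(input %*% W, 2, b, "+"))

# Two "layers" chained: each transformation distills the representation
input <- matrix(rnorm(4 * 8), nrow = 4, ncol = 8)   # 4 samples, 8 features
W1 <- matrix(rnorm(8 * 5), 8, 5); b1 <- rnorm(5)    # 8 features -> 5
W2 <- matrix(rnorm(5 * 2), 5, 2); b2 <- rnorm(2)    # 5 features -> 2
output <- dense_layer(dense_layer(input, W1, b1), W2, b2)
dim(output)  # 4 2
```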

Keras Layers

A grammar for specifying the layers of a neural network

model <- keras_model_sequential() %>%
  layer_conv_2d(filters = 32, kernel_size = c(3,3), activation = 'relu',
                input_shape = input_shape) %>% 
  layer_conv_2d(filters = 64, kernel_size = c(3,3), activation = 'relu') %>% 
  layer_max_pooling_2d(pool_size = c(2, 2)) %>% 
  layer_dropout(rate = 0.25) %>% 
  layer_flatten() %>% 
  layer_dense(units = 128, activation = 'relu') %>% 
  layer_dropout(rate = 0.5) %>% 
  layer_dense(units = 10, activation = 'softmax')

Keras: Data Preprocessing

library(keras)

# Load MNIST images datasets (built in to Keras)
c(c(x_train, y_train), c(x_test, y_test)) %<-% dataset_mnist()

# Flatten images and transform RGB values into [0,1] range 
x_train <- array_reshape(x_train, c(nrow(x_train), 784))
x_test <- array_reshape(x_test, c(nrow(x_test), 784))
x_train <- x_train / 255
x_test <- x_test / 255

# Convert class vectors to binary class matrices
y_train <- to_categorical(y_train, 10)
y_test <- to_categorical(y_test, 10)
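to_categorical() one-hot encodes the 0-based digit labels into a binary class matrix. A minimal base-R equivalent (to_one_hot is a hypothetical helper, shown only to illustrate what the preprocessing step produces):

```r
to_one_hot <- function(labels, num_classes) {
  m <- matrix(0, nrow = length(labels), ncol = num_classes)
  m[cbind(seq_along(labels), labels + 1)] <- 1  # labels assumed 0-based
  m
}

to_one_hot(c(0, 2, 1), num_classes = 3)
#      [,1] [,2] [,3]
# [1,]    1    0    0
# [2,]    0    0    1
# [3,]    0    1    0
```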

Keras: Model Definition

model <- keras_model_sequential()  %>% 
  layer_dense(units = 256, activation = 'relu', input_shape = c(784)) %>% 
  layer_dropout(rate = 0.4) %>% 
  layer_dense(units = 128, activation = 'relu') %>%
  layer_dropout(rate = 0.3) %>%
  layer_dense(units = 10, activation = 'softmax')

model %>% compile(
  loss = 'categorical_crossentropy',
  optimizer = optimizer_rmsprop(),
  metrics = c('accuracy')
)

Keras: Model Definition (cont.)

summary(model)
_____________________________________________________________________________________
Layer (type)                          Output Shape                      Param #      
=====================================================================================
dense_1 (Dense)                       (None, 256)                       200960       
_____________________________________________________________________________________
dropout_1 (Dropout)                   (None, 256)                       0            
_____________________________________________________________________________________
dense_2 (Dense)                       (None, 128)                       32896        
_____________________________________________________________________________________
dropout_2 (Dropout)                   (None, 128)                       0            
_____________________________________________________________________________________
dense_3 (Dense)                       (None, 10)                        1290         
=====================================================================================
Total params: 235,146
Trainable params: 235,146
Non-trainable params: 0
_____________________________________________________________________________________

Keras: Model Training

history <- model %>% fit(
  x_train, y_train,
  batch_size = 128,
  epochs = 30,
  validation_split = 0.2
)
history
Trained on 48,000 samples, validated on 12,000 samples (batch_size=128, epochs=30)
Final epoch (plot to see history):
     acc: 0.9057
    loss: 1.5
 val_acc: 0.9317
val_loss: 1.088 

Keras: Model Training (cont.)

plot(history)

Keras: Evaluation and Prediction

model %>% evaluate(x_test, y_test)
$loss
[1] 0.1078904

$acc
[1] 0.9815
model %>% predict_classes(x_test[1:100,])
  [1] 7 2 1 0 4 1 4 9 5 9 0 6 9 0 1 5 9 7 3 4 9 6 6 5 4 0 7 4 0 1 3 1 3 4 7
 [36] 2 7 1 2 1 1 7 4 2 3 5 1 2 4 4 6 3 5 5 6 0 4 1 9 5 7 8 9 3 7 4 6 4 3 0
 [71] 7 0 2 9 1 7 3 2 9 7 7 6 2 7 8 4 7 3 6 1 3 6 9 3 1 4 1 7 6 9
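predict_classes() returns, for each sample, the index of the largest softmax probability. That argmax step can be sketched in base R (the probabilities below are made up; subtracting 1 converts to 0-based class labels):

```r
# Softmax probabilities for 3 samples over 4 classes (rows sum to 1)
probs <- rbind(c(0.10, 0.70, 0.10, 0.10),
               c(0.20, 0.20, 0.50, 0.10),
               c(0.90, 0.05, 0.03, 0.02))

# Predicted class = column of the row maximum, minus 1 for 0-based labels
classes <- max.col(probs) - 1
classes  # 1 2 0
```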

Keras: Demo

TensorFlow Estimators

High level API for TensorFlow models https://tensorflow.rstudio.com/tfestimators/

Estimator                          Description
linear_regressor()                 Linear regression.
linear_classifier()                Linear classification.
dnn_regressor()                    Deep neural network regression.
dnn_classifier()                   Deep neural network classification.
dnn_linear_combined_regressor()    DNN-linear combined regression.
dnn_linear_combined_classifier()   DNN-linear combined classification.

TensorFlow Core API

Low level access to TensorFlow graph operations https://tensorflow.rstudio.com/tensorflow/

library(tensorflow)

# Synthetic data for a simple linear model: y = 0.1 * x + 0.3
x_data <- runif(100, min = 0, max = 1)
y_data <- x_data * 0.1 + 0.3

W <- tf$Variable(tf$random_uniform(shape(1L), -1.0, 1.0))
b <- tf$Variable(tf$zeros(shape(1L)))
y <- W * x_data + b

loss <- tf$reduce_mean((y - y_data) ^ 2)
optimizer <- tf$train$GradientDescentOptimizer(0.5)
train <- optimizer$minimize(loss)

sess <- tf$Session()
sess$run(tf$global_variables_initializer())

for (step in 1:200) {
  sess$run(train)
  if (step %% 20 == 0)
    cat(step, "-", sess$run(W), sess$run(b), "\n")
}

Deep Learning

TODO: Deep learning section

Supporting Tools

tfruns package

Track, visualize, and manage training runs and experiments


cloudml package

tfdeploy package

Deployment: SavedModels

Deployment: TensorFlow Serving

Deployment: RStudio Connect

Deployment: CloudML

Learning more

https://tensorflow.rstudio.com/learn/

  • Recommended reading

  • Keras for R cheatsheet

  • Gallery and examples

  • Subscribe to the TensorFlow for R blog!

Recommended Reading

Keras for R cheatsheet

https://github.com/rstudio/cheatsheets/raw/master/keras.pdf

Gallery and examples

https://tensorflow.rstudio.com/learn/gallery.html

Thank you!

In Place Modification

Keras objects have reference semantics: they are modified in place rather than copied on assignment (unlike conventional R objects)

# Modify model object in place (note that it is not assigned back to)
model %>% compile(
  optimizer = 'rmsprop',
  loss = 'binary_crossentropy',
  metrics = c('accuracy')
)
  • Keras models are directed acyclic graphs of layers whose state is updated during training.

  • Keras layers can be shared by multiple parts of a Keras model.
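The same reference semantics can be illustrated with a base R environment, which (like a Keras model) is modified in place rather than copied when passed to a function:

```r
counter <- new.env()
counter$n <- 0

increment <- function(env) env$n <- env$n + 1

increment(counter)   # no assignment back to counter, yet...
increment(counter)
counter$n            # 2 -- the object was modified in place
```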